Indexing High Dimensional Rectangles for Fast Multimedia Identification
Abstract
This paper addresses the problem of quickly performing point queries against high-dimensional regions. Such queries are useful in the increasingly important problems of multimedia identification and retrieval, where different database entries have different metrics for similarity. While the database literature has focused on indexing for high-dimensional nearest neighbor and epsilon range queries, indexing for point queries against high-dimensional regions has not been addressed. We present an efficient indexing method for these queries, which relies on the combination of redundancy and bit vector indexing to achieve significant performance gains. We have implemented our approach in a real-world audio fingerprinting system, and have obtained a factor of 56 speed-up over linear scan. Furthermore, the well-known Hilbert bulk-loaded R-Trees, a technique capable of searching low-dimensional regions, are shown to be ineffective in our audio fingerprinting system, because of the inherently high-dimensional properties of the problem.
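The abstract does not spell out the index layout, so the following is only a minimal sketch of how per-dimension bit vectors can answer a point query against a set of axis-aligned rectangles. The class name `BitVectorRectIndex`, the helper `contains`, and the choice of bin boundaries (the sorted rectangle endpoints in each dimension) are assumptions made for illustration, not the paper's implementation, and the redundancy component mentioned above is not modeled.

```python
from bisect import bisect_right


def contains(rect, point):
    """Exact test: is `point` inside the closed rectangle (lows, highs)?"""
    lows, highs = rect
    return all(lo <= x <= hi for lo, x, hi in zip(lows, point, highs))


class BitVectorRectIndex:
    """Hypothetical per-dimension bit-vector index over rectangles.

    Each dimension is cut into bins at the rectangle endpoints. Every bin
    stores one bit vector (a Python int) whose bit `i` is set when
    rectangle `i` overlaps that bin.
    """

    def __init__(self, rects):
        # rects: list of (lows, highs); all sequences share one length.
        self.rects = list(rects)
        self.dims = len(self.rects[0][0]) if self.rects else 0
        self.bounds = []  # per dimension: sorted unique endpoints
        self.masks = []   # per dimension: one bit vector per bin
        for d in range(self.dims):
            b = sorted({r[0][d] for r in self.rects} |
                       {r[1][d] for r in self.rects})
            # Bin k covers [b[k-1], b[k]); bin 0 lies below b[0] and
            # bin len(b) covers everything from b[-1] upward.
            m = [0] * (len(b) + 1)
            for rid, (lows, highs) in enumerate(self.rects):
                first = bisect_right(b, lows[d])   # bin holding the low edge
                last = bisect_right(b, highs[d])   # bin holding the high edge
                for k in range(first, last + 1):
                    m[k] |= 1 << rid
            self.bounds.append(b)
            self.masks.append(m)

    def query(self, point):
        """Return ids of all rectangles that contain `point`."""
        if not self.rects:
            return []
        candidates = (1 << len(self.rects)) - 1
        for d in range(self.dims):
            k = bisect_right(self.bounds[d], point[d])
            candidates &= self.masks[d][k]  # intersect across dimensions
            if candidates == 0:
                return []
        # Bins are coarser than the rectangles, so verify each survivor.
        hits, rid = [], 0
        while candidates:
            if candidates & 1 and contains(self.rects[rid], point):
                hits.append(rid)
            candidates >>= 1
            rid += 1
        return hits


if __name__ == "__main__":
    index = BitVectorRectIndex([
        ((0, 0), (2, 2)),
        ((1, 1), (3, 3)),
        ((5, 5), (6, 6)),
    ])
    print(index.query((1.5, 1.5)))
    print(index.query((4, 4)))
```

The bitwise AND discards most rectangles without touching their coordinates, and the final `contains` check removes the few false positives that arise because a bin can extend past a rectangle's upper edge. Memory grows with the number of bins times the number of rectangles, which is one reason a production system would likely compress or restrict the bit vectors.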
Similar resources
High-dimensional Similarity Joins
A. Toga. QBISM: a prototype 3-d medical image database system. B. Seeger. The R-tree: an efficient and robust access method for points and rectangles. [7] C. Faloutsos and K.-I. Lin. Fastmap: A fast algorithm for indexing, data-mining and visualization of traditional and multimedia datasets. [8] A. Guttman. R-trees: a dynamic index structure for spatial searching. [12] D. Lomet and B. Salzberg....
HDIdx: High-dimensional indexing for efficient approximate nearest neighbor search
Fast Nearest Neighbor (NN) search is a fundamental challenge in large-scale data processing and analytics, particularly for analyzing multimedia contents which are often of high dimensionality. Instead of using exact NN search, extensive research efforts have been focusing on approximate NN search algorithms. In this work, we present “HDIdx”, an efficient high-dimensional indexing library for f...
A Method Based on Divisive Hierarchical Clustering for Indexing Image Data
It is conventional to use multi-dimensional indexing structures to accelerate search operations in content-based image retrieval systems. Many efforts have been made so far to develop multi-dimensional indexing structures. In most practical applications of image retrieval, high-dimensional feature vectors are required, but current multi-dimensional indexing structures lose their effici...
Impact of Storage Technology on the Efficiency of Cluster-Based High-Dimensional Index Creation
The scale of multimedia data collections is expanding at a very fast rate. In order to cope with this growth, the high-dimensional indexing methods used for content-based multimedia retrieval must adapt gracefully to secondary storage. Recent progress in storage technology, however, means that algorithm designers must now cope with a spectrum of secondary storage solutions, ranging from traditi...
Two Dimensional Matching 10.1 Exact Matching
String matching is a basic theoretical problem in computer science, but has been useful in implementing various text editing tasks. The explosion of multimedia requires an appropriate generalization of string matching to higher dimensions. The first natural generalization is that of seeking the occurrences of a pattern in a text where both pattern and text are rectangles. The last few years saw...
Publication date: 2003